Reducing the Synchronization Overhead in Parallel Nonsymmetric Krylov Algorithms on MIMD Machines
Authors
Abstract
By considering electromagnetic scattering problems as examples, a study of the performance and scalability of the conjugate gradient squared (CGS) algorithm on two MIMD machines is presented. A modified CGS (MCGS) algorithm, where the synchronization overhead is effectively reduced by a factor of two, is proposed in this paper. This is achieved by changing the computation sequence in the CGS algorithm. Both experimental and theoretical analyses were performed to investigate the impact of this modification on the overall execution time.
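The abstract does not spell out how MCGS reorders the computation, so the following is only a minimal sketch of the textbook (unpreconditioned) CGS iteration, written to show where the synchronization points come from. It assumes real-valued dense data for brevity, whereas the paper targets complex sparse systems. The names `cgs`, `SyncCounter`, and `fused_dots` are hypothetical. Each call to `fused_dots` stands in for one global reduction on a distributed-memory machine; inner products issued in the same call are counted as sharing a single synchronization.

```python
def matvec(A, x):
    """Dense matrix-vector product; stands in for a distributed sparse matvec."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


class SyncCounter:
    """Counts global reductions. Hypothetical helper, not from the paper."""

    def __init__(self):
        self.count = 0

    def fused_dots(self, pairs):
        # One call models one synchronization point: all inner products
        # passed in together would share a single global reduction.
        self.count += 1
        return [sum(a * b for a, b in zip(u, v)) for u, v in pairs]


def cgs(A, b, tol=1e-10, max_iter=100):
    """Textbook CGS with x0 = 0. Returns (x, iterations, sync_points)."""
    n = len(b)
    sync = SyncCounter()
    x = [0.0] * n
    r = list(b)            # r0 = b - A*x0 with x0 = 0
    r_shadow = list(r)     # fixed shadow residual
    p = [0.0] * n
    q = [0.0] * n
    rho_prev = 1.0
    iters = 0
    for k in range(max_iter):
        # Sync point 1: rho and the squared residual norm, fused together.
        rho, rr = sync.fused_dots([(r_shadow, r), (r, r)])
        if rr ** 0.5 <= tol:
            break
        if rho == 0.0:
            raise ZeroDivisionError("CGS breakdown: rho = 0")
        if k == 0:
            u = list(r)
            p = list(u)
        else:
            beta = rho / rho_prev
            u = [ri + beta * qi for ri, qi in zip(r, q)]
            p = [ui + beta * (qi + beta * pi) for ui, qi, pi in zip(u, q, p)]
        v = matvec(A, p)
        # Sync point 2: sigma depends on v = A p, so in this ordering it
        # cannot be fused with sync point 1.
        (sigma,) = sync.fused_dots([(r_shadow, v)])
        if sigma == 0.0:
            raise ZeroDivisionError("CGS breakdown: sigma = 0")
        alpha = rho / sigma
        q = [ui - alpha * vi for ui, vi in zip(u, v)]
        uq = [ui + qi for ui, qi in zip(u, q)]
        x = [xi + alpha * wi for xi, wi in zip(x, uq)]
        A_uq = matvec(A, uq)
        r = [ri - alpha * wi for ri, wi in zip(r, A_uq)]
        rho_prev = rho
        iters += 1
    return x, iters, sync.count
```

In this ordering every completed iteration costs two synchronization points, because `sigma` needs the result of a matrix-vector product that itself depends on `rho`. A reordering that lets both inner products be evaluated in one fused reduction would halve that count, which is consistent with the factor-of-two reduction the abstract reports for MCGS; the actual MCGS recurrences are not reproduced here.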
Similar Resources
MCGS: A Modified Conjugate Gradient Squared Algorithm for Nonsymmetric Linear Systems
The conjugate gradient squared (CGS) algorithm is a Krylov subspace algorithm that can be used to obtain fast solutions for linear systems (Ax = b) with complex nonsymmetric, very large, and very sparse coefficient matrices (A). By considering electromagnetic scattering problems as examples, a study of the performance and scalability of this algorithm on two MIMD machines is presented. A modified ...
MCGS: A Modified Conjugate Gradient Squared Algorithm for Nonsymmetric Linear Systems
The conjugate gradient squared (CGS) algorithm is a Krylov subspace algorithm that can be used to obtain fast solutions for linear systems (Ax = b) with complex nonsymmetric, very large, and very sparse coefficient matrices (A). By considering electromagnetic scattering problems as examples, a study of the performance and scalability of this algorithm on two MIMD machines is presented. A modified CGS (MCGS) algo...
Optimizing the Emulation of MIMD Behavior on SIMD Machines
SIMD computers have proved to be a useful and cost-effective approach to massively parallel computation. On the other hand, there are algorithms which are very inefficient when directly translated into a data-parallel program. This paper presents a number of simple transformations which are able to reduce this SIMD overhead to a moderate constant factor. In particular, this factor is often much s...
Efficient Emulation of MIMD Behavior on SIMD Machines
SIMD computers have proved to be a useful and cost-effective approach to massively parallel computation. On the other hand, there are algorithms which are very inefficient when directly translated into a data-parallel program. This paper presents a number of simple transformations which are able to reduce this SIMD overhead to a moderate constant factor. It also introduces techniques for reducing ...
A Parallel Algorithm for Connected Components on Distributed Memory Machines
Finding connected components (CC) of an undirected graph is a fundamental computational problem. Various CC algorithms exist for PRAM models. An implementation of a PRAM CC algorithm on a coarse-grain MIMD machine with distributed memory brings many problems, since the communication overhead is substantial compared to the local computation. Several implementations of CC algorithms on distribute...
Publication date: 1998